Mixture model approach for network forecasting

ABSTRACT

Disclosed are various embodiments that provide a mixture model approach to network forecasting. Network traffic models are generated for multiple host types. Weights are determined for the network traffic models. A network forecast is generated based at least in part on a hardware footprint forecast and on the network traffic models as weighted by the determined weights.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of, and claims priority to, co-pending U.S. patent application entitled “MIXTURE MODEL APPROACH FOR NETWORK FORECASTING,” filed on Sep. 26, 2013, and assigned application Ser. No. 14/037,565, which is incorporated herein by reference in its entirety.

BACKGROUND

Service forecasts are unreliable for long-term network capacity planning due to their volatility. Such forecasts may vary widely from week to week and cannot be reliably employed in managing long-term network capacity planning for dates that may be six months or more in the future. Reliance on such forecasts may result in poor preparation for unanticipated growth in network traffic.

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.

FIG. 1 depicts an example scenario illustrating the use of a mixture model for network forecasting according to various embodiments of the present disclosure.

FIG. 2 is a drawing of a networked environment according to various embodiments of the present disclosure.

FIGS. 3A and 3B depict a flowchart illustrating one example of functionality implemented as portions of a mixture model forecasting engine executed in a computing environment in the networked environment of FIG. 2 according to various embodiments of the present disclosure.

FIG. 4 is a schematic block diagram that provides one example illustration of a computing device employed in the networked environment of FIG. 2 according to various embodiments of the present disclosure.

DETAILED DESCRIPTION

The present disclosure relates to a mixture model approach for network forecasting. Both service forecasts and hardware footprint forecasts may be available for a network. Service forecasts may indicate the demand for various services provided within the network, e.g., storage services, database services, electronic commerce services, utility computing services, and so on. Service forecasts may include host type forecasts for various types of hosts employed by the services. Hardware footprint forecasts may indicate actual and planned deployments for different configurations of computing devices and/or racks of such computing devices. Though potentially reliable for short-term network capacity planning (e.g., one to three months in the future), service forecasts may be unreliable for long-term network capacity planning due to their high volatility. By contrast, hardware footprint forecasts may be more reliable for long-term planning (e.g., six months or more in the future) but may not capture nearer-term information such as product launches.

Various embodiments of the present disclosure provide for a mixture model approach that incorporates both service forecasts and hardware footprint forecasts in order to generate combination forecasts for network capacity planning. A mixture model approach is desirable because it accommodates the different reliabilities from the different types of forecasts. In other words, a mixture model approach will draw most heavily from the most accurate source for a given time period, while drawing least from inaccurate sources. The mixture model approach may be utilized to forecast computing devices of different configurations behind a layer of the network. Using historical data, network traffic may be estimated for the forecasted computing devices. The estimated network traffic may then be used to forecast network capacity needs.

With reference to FIG. 1, shown is an example scenario 100 illustrating the use of a mixture model for network forecasting according to various embodiments. In the example scenario 100, a hardware footprint forecast 103 and a service forecast 106 may be provided in order to generate a combination network forecast 109. As used herein, a “forecast” may encompass a specific predicted value or a predicted time series of values. The hardware footprint forecast 103 may indicate the current hardware footprint status of existing data centers as well as projected changes in existing data centers and additions of new data centers. In this non-limiting example, “data center 1” and “data center 2” each have an existing hardware footprint of “1024 racks.” “Data center 3” has a projected hardware footprint of “4096 racks,” and “data center 4” has a projected hardware footprint of “8192 racks.” Although the hardware footprints are discussed in this example in terms of racks, the hardware footprints in other examples may be provided in terms of device capacity, floor area, utility capabilities, and/or other metrics for a data center. In one embodiment, the hardware footprint forecasts 103 are manually created by engineers or other design specialists in charge of management and build out of data centers.

The service forecast 106 in this non-limiting example is a host type forecast for a number of hosts of a given type over time. The service forecast 106 may have a relatively great volatility, e.g., the hosts of a given type in use may vary widely. The service forecast 106 may depend on various events such as product launches, pricing changes, and other events that may impact service usage. Various network services in an organization may employ specific types of hosts, and forecasts created for several different services may be aggregated to provide a forecast for specific types of hosts. Such service forecasts 106 may be provided in terms of network traffic and/or hardware usage (e.g., hosts of a given host type, racks containing hosts of a given host type, computing devices of a given type, and so on).

The output of the mixture model as will be described may be a combination network forecast 109. The combination network forecast 109 takes into account both the hardware footprint forecast 103 and the service forecast 106 to generate a combination forecast for network traffic or network capacity over time. To this end, the combination network forecast 109 may take into account multiple service forecasts 106 relating to multiple different types of hosts. In the following discussion, a general description of the system and its components is provided, followed by a discussion of the operation of the same.

Turning now to FIG. 2, shown is a networked environment 200 according to various embodiments. The networked environment 200 includes a computing environment 203 in data communication with a plurality of data centers 206a . . . 206N by way of a network 209. The network 209 includes, for example, the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, cable networks, satellite networks, or other suitable networks, etc., or any combination of two or more such networks.

Although the discussion herein refers to data centers 206, it is understood that the principles of the present disclosure may apply similarly to availability zones. An availability zone may correspond to a distinct location of computing devices that is engineered to be insulated from failures in other availability zones. In one embodiment, an availability zone may correspond to a data center 206. In other embodiments, an availability zone may correspond to a floor, a portion of a floor, a rack, or another location within a data center 206. Because each availability zone is configured to fail independently of the other availability zones, each availability zone may be provided, for example, with a distinct generator or other backup power source, a distinct connection to the power grid, a distinct connection to the network 209, distinct equipment facilitating power and/or network 209 connections, distinct heating and/or cooling equipment, distinct fire protection, and/or other features. Thus, multiple availability zones may be housed within a single data center 206 or separate data centers 206 depending in part on the available resources at a data center 206.

The computing environment 203 may comprise, for example, a server computer or any other system providing computing capability. Alternatively, the computing environment 203 may employ a plurality of computing devices that are arranged, for example, in one or more server banks or computer banks or other arrangements. Such computing devices may be located in a single installation or may be distributed among many different geographical locations. For example, the computing environment 203 may include a plurality of computing devices that together may comprise a hosted computing resource, a grid computing resource, and/or any other distributed computing arrangement. In some cases, the computing environment 203 may correspond to an elastic computing resource where the allotted capacity of processing, network, storage, or other computing-related resources may vary over time. In some embodiments, the computing environment 203 may comprise one or more client computing devices such as, for example, desktop computers, laptop computers, personal digital assistants, cellular telephones, smartphones, set-top boxes, music players, web pads, tablet computer systems, game consoles, electronic book readers, or other devices with like capability.

Various applications and/or other functionality may be executed in the computing environment 203 according to various embodiments. Also, various data is stored in a data store 212 that is accessible to the computing environment 203. The data store 212 may be representative of a plurality of data stores 212 as can be appreciated. The data stored in the data store 212, for example, is associated with the operation of the various applications and/or functional entities described below.

The components executed in the computing environment 203, for example, include a network measurement service 215, a mixture model forecasting engine 218, and other applications, services, processes, systems, engines, or functionality not discussed in detail herein. The network measurement service 215 is executed to provide network usage data for use in generating network forecasts. Specifically, the network measurement service 215 may be capable of querying various devices in the network 209 to provide network utilization information for network links between nodes. Additionally, the network measurement service 215 may be capable of determining network bandwidth consumed by various services and/or hosts operating in the data centers 206. The network measurement service 215 may obtain network information via simple network management protocol (SNMP) and/or other protocols.

The mixture model forecasting engine 218 is executed to generate network forecasts by applying a mixture model to combine hardware footprint forecasts 221 with service forecasts 224. The output forecasts may correspond to a mixture model computing hardware forecast 227, a mixture model network usage forecast 230, a mixture model network hardware forecast 233, and/or other forecasts. As will be described, the mixture model forecasting engine 218 is configured to generate such forecasts by applying appropriate weighting to forecast models for each of several different host types employed in the data centers 206.

The data stored in the data store 212 includes, for example, hardware footprint forecasts 221, service forecasts 224, network usage history 236, host type information 239, training data sets 242, mixture model computing hardware forecasts 227, mixture model network usage forecasts 230, mixture model network hardware forecasts 233, mixture model configuration parameters 245, and potentially other data. The hardware footprint forecasts 221 may provide actual and projected hardware capacity at the data centers 206. For example, the hardware footprint forecasts 221 may indicate a number of racks or rack units and/or a number of spaces for racks and/or rack units at each of the data centers 206, including planned data centers 206 that have yet to be constructed. The hardware capacity described by the hardware footprint forecasts 221 may correspond to generic hardware capacity or hardware capacity reserved for, or presently in use by, specific types of computing hardware.

The service forecasts 224 may provide usage forecasts for various services offered by an organization via the data centers 206. Such services may correspond to data storage services, database services, utility computing services, electronic commerce services, and/or other services. The service forecasts 224 may be provided in terms of predicted network usage, predicted computing hardware usage (e.g., racks, rack units, computing devices, etc.), predicted host usage, and so on. The service forecasts 224 may be provided in various dimensions of usage, such as predicted usage of a network link between a first network node and a second network node, predicted usage of a specific type of computing device, predicted usage of a specific type of host (which may be implemented using several different computing device types), and/or other dimensions of usage.

The network usage history 236 may provide historical time series data for usage of various links within the network 209. For example, the network usage history 236 may describe historical bandwidth consumption on a link between a first data center 206 and a second data center 206, or more generally, a link between a first node and a second node, where nodes may correspond to load balancers, routers, switches, or other types of network hardware. The network usage history 236 may also indicate historical numbers of hosts in data centers 206 coupled to the network 209.

The host type information 239 may provide data regarding various types of hosts in the data centers 206. A host may correspond to an actual machine or to a virtual machine. A virtual host is a virtualized computer system, or a software implementation of a physical computing system. Virtual machines may provide for multiple and/or different operating system environments to run concurrently on a single system having a processor circuit and a memory. As a non-limiting example, multiple instances of a Linux® operating system environment may execute concurrently with multiple instances of a Microsoft® Windows® operating system environment on a single system. Each host may be controlled by different customers, who may have administrative access only to their own instance(s) and no access to the instances of other customers. Multiple hosts may in fact execute concurrently on a computer system including parallel processors, although multiple instances may appear to execute concurrently on a multithreaded computer system with fewer processors than instances.

Different types of hosts may be available. For example, a first host type may be optimized for data storage, a second host type may be optimized for computation, a third host type may be optimized for system memory, a fourth host type may be optimized for graphics processing, and a fifth host type may be a general purpose host. Each of these types of hosts may be available in multiple sizes, e.g., small, medium, and large. As a non-limiting example, a large general purpose host may have four CPU-equivalent units, 15 GB of system memory, and 1,000 GB of data storage. A medium general purpose host may have two CPU-equivalent units, 10 GB of system memory, and 500 GB of data storage. A small general purpose host may have one CPU-equivalent unit, 5 GB of system memory, and 250 GB of data storage. In one embodiment, a host may comprise an allocation of an entire computing device with no virtualization.

It is noted that different types of hosts may be associated with different network usage profiles. For example, a type of host that is frequently used to implement a storage service may frequently shard data and assemble shards for storage. Such a storage service may exhibit a high rate of data transfer. By contrast, a type of host that is optimized for computation may exhibit relatively little bandwidth consumption.

In various embodiments, a customer may be capable of launching new hosts and/or terminating hosts dynamically. Thus, the data centers 206 may provide elastic computing capability to the customer that can vary over time. As a non-limiting example, a customer hosting an infrequently visited network site on a host may suddenly get an influx of network page hits when the network site is mentioned on television or linked on a popular network site. The increase in network site traffic may overwhelm the computing capability of the host, leading to poor network site performance and availability. To cope with the network site traffic, the customer may launch new hosts and/or transition to a host with more resources and better performance.

The training data sets 242 include data employed for training various network and/or hardware usage models for the mixture model forecasting engine 218. The training data set 242 may include, for example, a subset of the network usage history 236.

The mixture model computing hardware forecasts 227, the mixture model network usage forecasts 230, and the mixture model network hardware forecasts 233 correspond to forecasts generated by the mixture model forecasting engine 218 as will be described. The mixture model computing hardware forecasts 227 predict types of computing hardware (e.g., racks of computing devices, etc.) according to the mixture model. The mixture model computing hardware forecasts 227 may specify a respective predicted quantity for each of several different host types.

The mixture model network usage forecasts 230 predict network usage (e.g., peak bandwidth consumption) for various links within the network 209. In one embodiment, the mixture model network usage forecasts 230 may be generated using a mixture model computing hardware forecast 227 and the network usage history 236. The mixture model network hardware forecasts 233 predict network hardware to handle peak network usage. Such mixture model network hardware forecasts 233 may be based upon mixture model network usage forecasts 230 or may be calculated directly. Such mixture model network hardware forecasts 233 may predict quantities and types of network routers, load balancers, optical fiber connections, and/or other network hardware to provide sufficient network capacity to handle the peak predicted network usage.

The mixture model configuration parameters 245 include various parameters and/or other data that configure the operation of the mixture model forecasting engine 218. Such mixture model configuration parameters 245 may include threshold values, repetition parameters, weights, time horizons, and/or other parameters that are utilized by the mixture model forecasting engine 218.

Next, a general description of the operation of the various components of the networked environment 200 is provided. To begin, the network measurement service 215 executes and monitors the usage of the network 209 by the data centers 206 over time. Consequently, network usage history 236 is generated. Engineers or other users may create hardware footprint forecasts 221 based at least in part on build-out schedules for data centers 206. Service forecasts 224 may be generated based upon network usage history 236 for various services. Alternatively, the service forecasts 224 may be provided by entities in charge of the various services. The host type information 239 may be created based upon computing devices available or planned in the data centers 206 and what types of hosts are available for those computing devices. One or more training data sets 242 may be created based upon the network usage history 236, the hardware footprint forecasts 221, the service forecasts 224, and/or other input data.

The mixture model forecasting engine 218 is then invoked to generate one or more combination forecasts using a mixture model. Such forecasts may include, for example, mixture model computing hardware forecasts 227, mixture model network usage forecasts 230, mixture model network hardware forecasts 233, and/or other forecasts. The operation of the mixture model forecasting engine 218 is controlled by the mixture model configuration parameters 245.

Referring next to FIGS. 3A and 3B, shown is a flowchart that provides one example of the operation of a portion of the mixture model forecasting engine 218 according to various embodiments. The portion as shown implements an adaptive regression by mixing with model screening. It is understood that the flowchart of FIGS. 3A and 3B provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the mixture model forecasting engine 218 as described herein. As an alternative, the flowchart of FIGS. 3A and 3B may be viewed as depicting an example of steps of a method implemented in the computing environment 203 (FIG. 2) according to one or more embodiments.

Beginning with box 303 of FIG. 3A, the mixture model forecasting engine 218 obtains a training data set 242 (FIG. 2) from the data store 212 (FIG. 2) and splits the training data set 242 into a first portion and a second portion. Such first and second portions may be disjoint sets. In box 306, the mixture model forecasting engine 218 generates network traffic models for each host type as described in the host type information 239 (FIG. 2) using the first portion of the training data set 242.
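As a non-limiting illustration, the tasks of boxes 303 and 306 might be sketched as follows. The sketch assumes that each observation is a hypothetical (host count, traffic) pair for one host type, that the split is into equal halves, and that the network traffic model is a simple linear fit; the disclosure does not prescribe a particular model form, and the function names are illustrative only.

```python
def split_training_data(observations):
    """Split observations into two disjoint portions (equal halves assumed)."""
    data = list(observations)
    midpoint = len(data) // 2
    return data[:midpoint], data[midpoint:]


def fit_linear_model(points):
    """Ordinary least squares fit of traffic = intercept + slope * host_count."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical (host_count, traffic) observations for a single host type.
observations = [(10, 21.0), (20, 41.0), (30, 61.0),
                (40, 81.0), (50, 101.0), (60, 121.0)]

first_portion, second_portion = split_training_data(observations)
# The model is generated from the first portion only (box 306).
intercept, slope = fit_linear_model(first_portion)
```

In this sketch the second portion is left untouched so that it can later serve as held-out data for assessing the model.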

In box 309, the mixture model forecasting engine 218 determines one or more measures of quality for the network traffic models. For example, the mixture model forecasting engine 218 may determine error variance values for each of the network traffic models, Akaike information criterion (AIC) values for each of the network traffic models, Bayesian information criterion (BIC) values for each of the network traffic models, and/or other measures of quality. In box 318, the mixture model forecasting engine 218 excludes any of the network traffic models that do not meet quality criteria (e.g., both AIC and BIC criteria) from consideration in the forecast.

With regard to AIC values, the mixture model forecasting engine 218 may compute a delta AIC value for each network traffic model, which is the particular AIC value minus the minimum AIC value for all of the network traffic models. A delta AIC value less than 2 suggests substantial evidence for a particular model, while a delta AIC value greater than 10 suggests that the model is very unlikely. In one embodiment, network traffic models having a delta AIC of 10 or greater are excluded from consideration in the forecast. Likewise, delta BIC values may be computed for each network traffic model. In one embodiment, network traffic models having a delta BIC of 10 or greater are excluded from consideration in the forecast.
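The delta-based screening may be sketched as follows, as one possible reading in which a model is retained only if both its delta AIC and its delta BIC are below the threshold of 10. The function name, the dictionary layout, and the AIC and BIC values are hypothetical and chosen only for illustration.

```python
def screen_models(aic_by_model, bic_by_model, threshold=10.0):
    """Return the names of models whose delta AIC and delta BIC are both below threshold."""
    min_aic = min(aic_by_model.values())
    min_bic = min(bic_by_model.values())
    kept = []
    for name in aic_by_model:
        delta_aic = aic_by_model[name] - min_aic
        delta_bic = bic_by_model[name] - min_bic
        # A delta of `threshold` or greater on either criterion excludes the model.
        if delta_aic < threshold and delta_bic < threshold:
            kept.append(name)
    return kept


# Hypothetical AIC and BIC values keyed by host type.
aic = {"storage": 100.0, "compute": 101.5, "memory": 115.0, "general": 103.0}
bic = {"storage": 104.0, "compute": 120.0, "memory": 110.0, "general": 108.0}

kept = screen_models(aic, bic)
```

Here the "compute" model is excluded on its delta BIC of 16 and the "memory" model on its delta AIC of 15, leaving the "storage" and "general" models.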

In box 321, the mixture model forecasting engine 218 assesses the accuracy of the network traffic models by determining overall measures of discrepancy using the second portion of the training data set 242. Predictions of the network traffic models generated in box 306 for each host type may be computed for the second portion of the training data set 242, and the sum of squares error between the predictions of the network traffic models and the observed values in the second portion of the training data set 242 may be determined.
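A minimal sketch of one such discrepancy measure follows. It assumes that a model generated from the first portion is scored against held-out (host count, traffic) observations from the second portion; the model coefficients and the observations are hypothetical.

```python
def sum_of_squares_discrepancy(predict, holdout):
    """Sum of squared differences between observed traffic and model predictions."""
    return sum((traffic - predict(hosts)) ** 2 for hosts, traffic in holdout)


def predict(hosts):
    # Hypothetical model previously fit on the first portion.
    return 1.0 + 2.0 * hosts


# Hypothetical held-out observations from the second portion.
holdout = [(40, 82.0), (50, 100.0), (60, 121.0)]

discrepancy = sum_of_squares_discrepancy(predict, holdout)
```

The predictions are 81, 101, and 121, so the residuals are 1, -1, and 0 and the resulting discrepancy is 2.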

In box 324, the mixture model forecasting engine 218 determines relative weights for each of the network traffic models for the host types. In doing so, the mixture model forecasting engine 218 may calculate a numerator value for each network traffic model as follows:

$\mathit{variance}^{-n} \times e^{-\left(\mathit{variance}^{-2} \times \frac{\mathit{discrepancy}}{2}\right)},$

where variance is the error variance value for the model, n is an observation from the first portion of the training data set 242, and discrepancy is the overall measure of discrepancy for the model. The relative weight may then be derived by dividing the numerator value by the sum of all the numerator values. In box 327, the mixture model forecasting engine 218 may randomly permute the order of the data in the training data set 242.
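The weight calculation may be sketched as follows. The sketch assumes that the exponential term uses base e and that n is supplied by the caller as a numeric exponent; it evaluates the numerators in log space, which is an implementation choice made here for numerical stability rather than something stated in the disclosure.

```python
import math


def relative_weights(variances, discrepancies, n):
    """Normalize variance**(-n) * exp(-(variance**-2 * discrepancy / 2)) across models."""
    log_numerators = [
        -n * math.log(v) - (v ** -2) * d / 2.0
        for v, d in zip(variances, discrepancies)
    ]
    # Subtracting the largest log value avoids underflow and cancels on normalization.
    shift = max(log_numerators)
    numerators = [math.exp(value - shift) for value in log_numerators]
    total = sum(numerators)
    return [value / total for value in numerators]


# Hypothetical error variances and discrepancies for two models.
weights = relative_weights([1.0, 1.0], [2.0, 4.0], n=3)
```

With equal variances, the model with the smaller discrepancy receives the larger relative weight, and the weights sum to one.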

In box 330, the mixture model forecasting engine 218 determines whether to repeat the previous tasks. In one embodiment, the tasks of boxes 303-327 may be repeated N times using the differently randomized data. If the tasks are to be repeated, the mixture model forecasting engine 218 returns to box 303. If not, the mixture model forecasting engine 218 continues from box 330 to box 333 of FIG. 3B.

In box 333 of FIG. 3B, the mixture model forecasting engine 218 determines aggregate relative weights from the relative weights computed during the N iterations. For example, the mixture model forecasting engine 218 may determine the average relative weights, the median relative weights, the maximum relative weights, the minimum relative weights, and so on, from the relative weights determined in box 324. In box 336, the mixture model forecasting engine 218 generates network traffic models for each host type using the entire training data set 242 (FIG. 2). In box 339, the mixture model forecasting engine 218 weights each resulting network traffic model by the corresponding aggregate relative weight. In box 342, the mixture model forecasting engine 218 generates a network forecast based at least in part on a hardware footprint forecast 221 (FIG. 2) and the weighted network traffic models. Thereafter, the portion of the mixture model forecasting engine 218 ends.
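As a non-limiting sketch of boxes 333 through 342, the following averages the relative weights over the iterations and combines the per-host-type models into a single forecast. The assumption that the hardware footprint forecast is a single rack count, the per-rack traffic coefficients, and the function names are all hypothetical.

```python
from statistics import mean


def aggregate_weights(weights_per_iteration):
    """Average each model's relative weight over all iterations."""
    model_names = list(weights_per_iteration[0])
    return {
        name: mean(run[name] for run in weights_per_iteration)
        for name in model_names
    }


def mixture_forecast(models, weights, footprint):
    """Weighted sum of each model's traffic prediction for the forecast footprint."""
    return sum(weights[name] * models[name](footprint) for name in models)


# Hypothetical relative weights from N = 2 iterations.
runs = [
    {"storage": 0.6, "compute": 0.4},
    {"storage": 0.8, "compute": 0.2},
]
weights = aggregate_weights(runs)

# Hypothetical models refit on the entire training data set (traffic per rack).
models = {
    "storage": lambda racks: 5.0 * racks,
    "compute": lambda racks: 1.0 * racks,
}

forecast = mixture_forecast(models, weights, footprint=1000)
```

In this example the averaged weights are approximately 0.7 and 0.3, so the forecast is approximately 0.7 × 5000 + 0.3 × 1000 = 3800 traffic units. Averaging preserves a total weight of one; other aggregates such as the median may call for renormalization.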

As described, the generated forecast may be a mixture model network usage forecast 230 (FIG. 2). However, in another embodiment, the generated forecast may be a mixture model computing hardware forecast 227 (FIG. 2). In such an embodiment, the models generated may be hardware usage models or host type usage models rather than network traffic models. In another embodiment, mixture model network hardware forecasts 233 (FIG. 2) may be generated. In one case, mixture model network hardware forecasts 233 may be generated from mixture model network usage forecasts 230 as input along with data correlating the types of network hardware to provide network capacity for the worst case usage predicted by a mixture model network usage forecast 230.

With reference to FIG. 4, shown is a schematic block diagram of the computing environment 203 according to an embodiment of the present disclosure. The computing environment 203 includes one or more computing devices 400. Each computing device 400 includes at least one processor circuit, for example, having a processor 403 and a memory 406, both of which are coupled to a local interface 409. To this end, each computing device 400 may comprise, for example, at least one server computer, at least one client computer, or like device. The local interface 409 may comprise, for example, a data bus with an accompanying address/control bus or other bus structure as can be appreciated.

Stored in the memory 406 are both data and several components that are executable by the processor 403. In particular, stored in the memory 406 and executable by the processor 403 are a network measurement service 215, a mixture model forecasting engine 218, and potentially other applications. Also stored in the memory 406 may be a data store 212 and other data. In addition, an operating system may be stored in the memory 406 and executable by the processor 403.

It is understood that there may be other applications that are stored in the memory 406 and are executable by the processor 403 as can be appreciated. Where any component discussed herein is implemented in the form of software, any one of a number of programming languages may be employed such as, for example, C, C++, C#, Objective C, Java®, JavaScript®, Perl, PHP, Visual Basic®, Python®, Ruby, Flash®, or other programming languages.

A number of software components are stored in the memory 406 and are executable by the processor 403. In this respect, the term “executable” means a program file that is in a form that can ultimately be run by the processor 403. Examples of executable programs may be, for example, a compiled program that can be translated into machine code in a format that can be loaded into a random access portion of the memory 406 and run by the processor 403, source code that may be expressed in proper format such as object code that is capable of being loaded into a random access portion of the memory 406 and executed by the processor 403, or source code that may be interpreted by another executable program to generate instructions in a random access portion of the memory 406 to be executed by the processor 403, etc. An executable program may be stored in any portion or component of the memory 406 including, for example, random access memory (RAM), read-only memory (ROM), hard drive, solid-state drive, USB flash drive, memory card, optical disc such as compact disc (CD) or digital versatile disc (DVD), floppy disk, magnetic tape, or other memory components.

The memory 406 is defined herein as including both volatile and nonvolatile memory and data storage components. Volatile components are those that do not retain data values upon loss of power. Nonvolatile components are those that retain data upon a loss of power. Thus, the memory 406 may comprise, for example, random access memory (RAM), read-only memory (ROM), hard disk drives, solid-state drives, USB flash drives, memory cards accessed via a memory card reader, floppy disks accessed via an associated floppy disk drive, optical discs accessed via an optical disc drive, magnetic tapes accessed via an appropriate tape drive, and/or other memory components, or a combination of any two or more of these memory components. In addition, the RAM may comprise, for example, static random access memory (SRAM), dynamic random access memory (DRAM), or magnetic random access memory (MRAM) and other such devices. The ROM may comprise, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other like memory device.

Also, the processor 403 may represent multiple processors 403 and/or multiple processor cores and the memory 406 may represent multiple memories 406 that operate in parallel processing circuits, respectively. In such a case, the local interface 409 may be an appropriate network that facilitates communication between any two of the multiple processors 403, between any processor 403 and any of the memories 406, or between any two of the memories 406, etc. The local interface 409 may comprise additional systems designed to coordinate this communication, including, for example, performing load balancing. The processor 403 may be of electrical or of some other available construction.

Although the network measurement service 215, the mixture modelforecasting engine 218, and other various systems described herein maybe embodied in software or code executed by general purpose hardware asdiscussed above, as an alternative the same may also be embodied indedicated hardware or a combination of software/general purpose hardwareand dedicated hardware. If embodied in dedicated hardware, each can beimplemented as a circuit or state machine that employs any one of or acombination of a number of technologies. These technologies may include,but are not limited to, discrete logic circuits having logic gates forimplementing various logic functions upon an application of one or moredata signals, application specific integrated circuits (ASICs) havingappropriate logic gates, field-programmable gate arrays (FPGAs), orother components, etc. Such technologies are generally well known bythose skilled in the art and, consequently, are not described in detailherein.

The flowchart of FIGS. 3A and 3B shows the functionality and operation of an implementation of portions of the mixture model forecasting engine 218. If embodied in software, each block may represent a module, segment, or portion of code that comprises program instructions to implement the specified logical function(s). The program instructions may be embodied in the form of source code that comprises human-readable statements written in a programming language or machine code that comprises numerical instructions recognizable by a suitable execution system such as a processor 403 in a computer system or other system. The machine code may be converted from the source code, etc. If embodied in hardware, each block may represent a circuit or a number of interconnected circuits to implement the specified logical function(s).

Although the flowchart of FIGS. 3A and 3B shows a specific order of execution, it is understood that the order of execution may differ from that which is depicted. For example, the order of execution of two or more blocks may be scrambled relative to the order shown. Also, two or more blocks shown in succession in FIGS. 3A and 3B may be executed concurrently or with partial concurrence. Further, in some embodiments, one or more of the blocks shown in FIGS. 3A and 3B may be skipped or omitted. In addition, any number of counters, state variables, warning semaphores, or messages might be added to the logical flow described herein, for purposes of enhanced utility, accounting, performance measurement, or providing troubleshooting aids, etc. It is understood that all such variations are within the scope of the present disclosure.

Also, any logic or application described herein, including the network measurement service 215 and the mixture model forecasting engine 218, that comprises software or code can be embodied in any non-transitory computer-readable medium for use by or in connection with an instruction execution system such as, for example, a processor 403 in a computer system or other system. In this sense, the logic may comprise, for example, statements including instructions and declarations that can be fetched from the computer-readable medium and executed by the instruction execution system. In the context of the present disclosure, a “computer-readable medium” can be any medium that can contain, store, or maintain the logic or application described herein for use by or in connection with the instruction execution system.

The computer-readable medium can comprise any one of many physical media such as, for example, magnetic, optical, or semiconductor media. More specific examples of a suitable computer-readable medium would include, but are not limited to, magnetic tapes, magnetic floppy diskettes, magnetic hard drives, memory cards, solid-state drives, USB flash drives, or optical discs. Also, the computer-readable medium may be a random access memory (RAM) including, for example, static random access memory (SRAM) and dynamic random access memory (DRAM), or magnetic random access memory (MRAM). In addition, the computer-readable medium may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other type of memory device.

Further, any logic or application described herein, including the network measurement service 215 and the mixture model forecasting engine 218, may be implemented and structured in a variety of ways. For example, one or more applications described may be implemented as modules or components of a single application. Further, one or more applications described herein may be executed in shared or separate computing devices or a combination thereof. For example, a plurality of the applications described herein may execute in the same computing device 400, or in multiple computing devices in the same computing environment 203. Additionally, it is understood that terms such as “application,” “service,” “system,” “engine,” “module,” and so on may be interchangeable and are not intended to be limiting.

Disjunctive language such as the phrase “at least one of X, Y, or Z,” unless specifically stated otherwise, is otherwise understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present.

It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

Therefore, the following is claimed:
 1. A system, comprising: at least one computing device; and a mixture model forecasting engine executable in the at least one computing device, wherein when executed the mixture model forecasting engine causes the at least one computing device to at least: generate respective network traffic models for individual ones of a plurality of host types; determine respective weights for individual ones of the respective network traffic models; and generate a network forecast based at least in part on a hardware footprint forecast and on the respective network traffic models weighted by the respective weights.
 2. The system of claim 1, wherein the hardware footprint forecast defines a predicted number of rack units in at least one data center.
 3. The system of claim 1, wherein the network forecast indicates a measure of network capacity to handle predicted network traffic between a first data center and a second data center that are covered by the hardware footprint forecast.
 4. The system of claim 1, wherein the plurality of host types include at least one of: a host type optimized for data storage, a host type optimized for computation, or a host type optimized for system memory.
 5. The system of claim 1, wherein when executed the mixture model forecasting engine further causes the at least one computing device to at least: determine respective Akaike information criterion (AIC) values for individual ones of the respective network traffic models; and exclude at least one of the respective network traffic models from consideration in generating the network forecast based at least in part on the respective AIC values.
 6. The system of claim 1, wherein when executed the mixture model forecasting engine further causes the at least one computing device to at least: determine respective Bayesian information criterion (BIC) values for individual ones of the respective network traffic models; and exclude at least one of the respective network traffic models from consideration in generating the network forecast based at least in part on the respective BIC values.
 7. The system of claim 1, wherein when executed the mixture model forecasting engine further causes the at least one computing device to at least: determine respective error variance values for individual ones of the respective network traffic models; and wherein the respective weights are determined based at least in part on the respective error variance values.
 8. The system of claim 1, wherein when executed the mixture model forecasting engine further causes the at least one computing device to at least: determine respective overall measures of discrepancy using two disjoint training data sets for individual ones of the respective network traffic models; and wherein the respective weights are determined based at least in part on the respective overall measures of discrepancy.
 9. The system of claim 1, wherein the respective weights are determined based at least in part on a plurality of randomizations of a training data set.
 10. A method, comprising: generating, by a computing device, respective network traffic models for individual ones of a plurality of host types; determining, by the computing device, respective weights for individual ones of the network traffic models; and generating, by the computing device, a network forecast based at least in part on a hardware footprint forecast and on the respective network traffic models weighted by the respective weights.
 11. The method of claim 10, wherein the network forecast is generated using a mixture model.
 12. The method of claim 10, wherein the network forecast comprises a time series specifying a respective predicted quantity for each of the plurality of host types.
 13. The method of claim 10, wherein the plurality of host types include a host type optimized for data storage, a host type optimized for computation, and a host type optimized for system memory.
 14. The method of claim 10, further comprising generating, by the computing device, the network forecast based at least in part on historical network traffic data associated with the plurality of host types.
 15. The method of claim 14, wherein the network forecast indicates predicted network traffic between a first network node and a second network node.
 16. The method of claim 10, wherein the hardware footprint forecast defines a predicted number of rack units in at least one data center.
 17. The method of claim 10, further comprising: determining, by the computing device, respective Akaike information criterion (AIC) values for individual ones of the respective network traffic models; determining, by the computing device, respective Bayesian information criterion (BIC) values for individual ones of the respective network traffic models; and excluding, by the computing device, at least one of the respective network traffic models from consideration in generating the network forecast based at least in part on at least one of: the respective AIC values or the respective BIC values.
 18. A non-transitory computer-readable medium embodying a program executable in a computing device, wherein when executed the program causes the computing device to at least: generate respective network traffic models for individual ones of a plurality of host types; determine respective weights for individual ones of the respective network traffic models; and generate a network forecast based at least in part on a hardware footprint forecast and on the respective network traffic models weighted by the respective weights.
 19. The non-transitory computer-readable medium of claim 18, wherein the respective network traffic models are generated using a first training data set, the respective weights are determined based at least in part on respective overall measures of discrepancy for the individual ones of the network traffic models, and the respective overall measures of discrepancy assess a respective accuracy of the individual ones of the respective network traffic models using a second training data set.
 20. The non-transitory computer-readable medium of claim 18, wherein when executed the program further causes the computing device to at least generate a forecast for networking hardware to handle the network traffic predicted by the network forecast.
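
By way of a non-limiting illustration only, the operations recited in claims 1, 5, and 7 may be sketched as follows. The sketch rests on several assumptions that are not stated in the disclosure: each host type's network traffic model is taken to be a least-squares rate of traffic per rack unit, the weights are taken to be inversely proportional to each model's error variance and normalized to sum to one, the AIC is computed in its Gaussian least-squares form up to an additive constant, and the weighted models are combined as a weighted sum. The function names, the `aic_cutoff` parameter, and the sample values are hypothetical.

```python
import math

def fit_rate(units, traffic):
    """Least-squares slope through the origin: traffic ~ rate * units (assumed model form)."""
    return sum(u * t for u, t in zip(units, traffic)) / sum(u * u for u in units)

def error_variance(rate, units, traffic):
    """Mean squared residual of the fitted model, floored to avoid division by zero."""
    residuals = [t - rate * u for u, t in zip(units, traffic)]
    return max(sum(r * r for r in residuals) / len(residuals), 1e-12)

def gaussian_aic(variance, n, k=1):
    """AIC for a Gaussian least-squares fit with k parameters, up to an additive constant."""
    return n * math.log(variance) + 2 * k

def mixture_forecast(history, footprint, aic_cutoff=None):
    """Combine per-host-type models into one forecast.

    history:   {host_type: (rack_units, observed_traffic)} training data
    footprint: {host_type: forecast_rack_units} hardware footprint forecast
    aic_cutoff: hypothetical threshold; models with a larger AIC are excluded
    """
    models = {}
    for host, (units, traffic) in history.items():
        rate = fit_rate(units, traffic)
        var = error_variance(rate, units, traffic)
        models[host] = (rate, var, gaussian_aic(var, len(units)))

    # Optionally exclude poorly fitting models (compare claim 5).
    if aic_cutoff is not None:
        models = {h: m for h, m in models.items() if m[2] <= aic_cutoff}
    if not models:
        raise ValueError("all models were excluded")

    # Assumed weighting: inverse error variance, normalized (compare claim 7).
    inverse = {h: 1.0 / m[1] for h, m in models.items()}
    total = sum(inverse.values())
    weights = {h: v / total for h, v in inverse.items()}

    # Weighted sum of each model's prediction at the forecast footprint (compare claim 1).
    forecast = sum(weights[h] * models[h][0] * footprint[h] for h in models)
    return weights, forecast

# Hypothetical sample data: rack units versus observed traffic per host type.
history = {
    "storage": ([1, 2, 3], [2.1, 3.9, 6.0]),
    "compute": ([1, 2, 3], [5.0, 11.0, 14.0]),
}
footprint = {"storage": 40, "compute": 25}
weights, forecast = mixture_forecast(history, footprint)
```

In this sketch the model with the smaller error variance receives the larger weight, and supplying `aic_cutoff` removes a model from the mixture before the weights are normalized. Other weighting schemes, such as the discrepancy measures of claims 8 and 19 or the randomizations of claim 9, would replace the inverse-variance step.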